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ABSTRACT 

Several  studies  suggest  that  the  use  of  geometric  features 
along  with  spectral  information  improves  the  classification 
and  visualization  quality  of  hyperspectral  imagery.  These 
studies  normally  make  use  of  spatial  neighborhoods  of  hy¬ 
perspectral  pixels  for  extracting  these  geometric  features.  In 
this  work,  we  merge  point  cloud  Light  Detection  and  Rang¬ 
ing  (LiDAR)  data  and  hyperspectral  imagery  (HSI)  into  a 
single  sparse  modeling  pipeline  for  subpixel  mapping  and 
classification.  The  model  accounts  for  material  variability 
and  noise  by  using  learned  dictionaries  that  act  as  spectral 
endmembers.  Additionally,  the  estimated  abundances  are 
influenced  by  the  LiDAR  point  cloud  density,  particularly 
helpful  in  spectral  mixtures  involving  partial  occlusions  and 
illumination  changes  caused  by  elevation  differences.  We 
demonstrate  the  advantages  of  the  proposed  algorithm  with 
co-registered  LiDAR-HSI  data. 

1.  INTRODUCTION 

HSI  sensors  acquire  images  in  which  each  pixel  contains  nar¬ 
rowly  spaced  measurements  of  the  electromagnetic  spectrum, 
allowing  spectroscopic  analysis.  The  data  acquired  by  these 
spectrometers  play  significant  roles  in  biomedical,  environ¬ 
mental,  land- survey,  and  defense  applications.  It  contains 
very  high  spectral  resolution,  at  the  expense  of  less  (spatial) 
geometrical  information.  There  are  numerous  intrinsic  chal¬ 
lenges  associated  with  effective  ground  mapping  and  char¬ 
acterization  applications  when  using  overhead  HSI,  see  for 
example  [1-3].  These  are  noise  and  sensor  artifacts,  compli¬ 
cated  energy  interaction  schemes,  intra-class  variability,  and 
spectral  mixing.  Consider  for  example  a  region  in  a  scene 
where  there  are  trees  partially  occluding  a  road.  These  eleva¬ 
tion  differences  cause  single  pixels  to  have  energy  reflected 
from  both  the  tree  leaves  and  the  road,  and  are  also  affected  by 
shade.  This  problem  motivates  the  use  of  additional  informa¬ 
tion  sources  for  potentially  mitigating  these  effects.  LiDAR 
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point  cloud  data  provides  precise  range  information  on  a  three 
dimensional  space.  In  particular,  LiDAR  sensors  acquire  one 
or  multiple  elevation  measurements  per  single  (discretized) 
ground  planar  coordinate.  When  these  point  cloud  data  are 
co-registered  with  the  hyperspectral  pixels,  they  give  insight 
into  identifying  structural  changes,  including  partial  occlu¬ 
sions  within  a  spectral  pixel.  This  advantage  has  motivated 
several  works  to  use  LiDAR  and  HSI  for  improved  classifica¬ 
tion.  For  example,  in  [  ],  the  authors  used  depth  information 
from  LiDAR  as  part  of  the  parametrization  required  for  a 
bio-optical  model  to  perform  underwater  benthic  mapping. 
In  [5],  the  authors  studied  the  possible  correlations  of  the 
surface  roughness  and  minerals’  spectral  content.  In  [6],  Li¬ 
DAR  information  was  used  to  better  localize  small  targets  by 
first  performing  a  background/foreground  segmentation  on 
the  elevation  map,  and  then  using  regions  of  interest  based  on 
height  for  improved  small  target  detection.  It  has  also  been 
applied  for  obtaining  higher  discrimination  between  savanna 
tree  species  by  the  use  of  hand  crafted  decision  trees  [7-5  ] . 
With  the  exception  of  [4],  these  works  require  careful  hand 
tuning  of  decision  operations,  and  require  a  sequential  pro¬ 
cessing  of  LiDAR  and  HSI.  Our  framework,  described  in 
sections  2  and  3,  drifts  away  from  these  approaches  because 
the  model  simultaneously  uses  information  from  both  data 
sources  to  estimate  the  pixels  spectral  abundance  (and  cor¬ 
responding  labels).  The  HSI  cube  is  expressed  as  a  sparse 
linear  combination  of  learned  sources  (dictionary  atoms), 
giving  meaningful  material  abundance  estimates,  without 
explicit  dimension  reduction  or  subspace  projection  prepro¬ 
cessing  steps.  We  impose  spatial  coherence  in  the  sparse 
modeling-based  classification.  This  efficiently  fuses  spectral 
(HSI)  and  structural  (LiDAR)  information  by  incorporating 
local  and  nonlocal  connectivities  between  local  regions  in  the 
scene,  leading  to  a  grouping  criteria  that  induces  a  robust  and 
stable  abundance  mapping.  This  will  be  illustrated  with  real 
data  examples  in  Section  4. 

2.  SPARSE  MODELING  HSI 

Let  each  measured  pixel  y  =  [?/i,  7/2,  •••,?/&]  in  the  hyperspec¬ 
tral  image  be  a  vector  valued  function,  yi  :  5ft2  -A  ,  1  < 
i  <  b,  where  b  denotes  the  number  of  spectral  bands.  We  stack 
these  pixels  in  matrix  format  as  Y  =  [yi,  ...,yn]  E  5R6xn, 


where  n  is  the  total  number  of  available  pixels  distributed 
spatially.  All  the  entries  of  this  matrix  are  nonnegative.  In  ad¬ 
dition,  we  assume  that  the  measured  energy  Y  at  the  sensor  is 
proportional  to  the  area  covered  by  the  dictionary  of  materials 
^3/  and  the  reflectivity  of  the  media,  which  can  be  modeled  as 
the  linear  system  Y  =  ^3/A  +  N,  where  N  is  additive  noise 
with  bounded  energy  (||N|||,  <  cr2),  ^  G  5ft6x/c  is  a  dictio¬ 
nary  (soon  to  be  learned),  and  A  G  5R/cxn  is  the  associated 
matrix  of  coefficients  representing  the  mixture  of  dictionary 
atoms  when  composing  the  data.  The  goal  is  to  learn  the  dic¬ 
tionary  \l>  representing  the  materials,  and  their  proper  combi¬ 
nation  A,  just  from  Y  (unsupervised  case)  or  from  Y  and  a 
labeled  library  of  real  data  (supervised  case).  We  follow  for 
this  the  modeling  procedure  from  [10].  Assuming  there  are 
C  materials,  this  model  aims  to  learn  a  block-structured  dic¬ 
tionary  of  materials,  where  the  j  —  th  block  is  representative 
of  the  j  —  th  material,  j  G  [1, ...,  C\.  Learning  each  material 
subdictionary  can  be  summarized  as  solving  the  following  bi¬ 
convex  optimization  problem, 


arg  min 
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2 
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+A||ai||1} 
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where  there  are  rij  pixels  pertaining  to  the  j  —  th  class, 
A  =  [ai,...,anJ  G  3tkjXnj  (each  class-dictionary  can  po¬ 
tentially  have  a  different  size  kj ),  and  A  >  0  is  a  parameter 
that  controls  the  trade-off  between  reconstruction  quality  and 
sparsity.  After  learning  the  material  dictionaries  in  a  sepa¬ 
rate  fashion,  the  structured  dictionary  \l>  =  [SI/1, ...,  \l> c ]  is 
assembled  and  used  for  solving  for  the  corresponding  abun¬ 
dance  coefficients  originated  from  a  linear  combination  of 
atoms  from  SI/ : 


n 

A*  =  argmin  V  ^(y*>  a*).  (2) 

A^O  z ' 

i=  1 


Equations  (1)  and  (2)  provide  sufficient  information  for  HSI 
subpixel  mapping.  If  we  define  a  per-block  sum  operator 
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5ft1  x  kj  is  a  vector  of  ones  corresponding  to  the  number  kj  of 
atoms  per  subdictionary,  then,  a  mapping  /( y)  :  !ft6  — )►  5ftc  is 
/( y)  =  Ma,  corresponds  to  the  £i-norm  per  material,  yield¬ 
ing  fractional  abundance  estimates  (see  [l1  ]  for  details). 


3.  HSI-LIDAR  FUSION 

We  want  to  minimize  the  effects  of  partial  occlusions  in  the 
scene,  and  since  there  could  be  more  than  one  LiDAR  re¬ 
turn  sample  per  discrete  spatial  coordinate,1  we  pick  the  data 
sample  with  the  minimum  elevation  value  per  discrete  spa¬ 
tial  (x,  y) -coordinate.  Similarly,  we  use  the  average  intensity 
value  per  discrete  spatial  (x,  y) -coordinate.  We  concatenate 
these  two  values  into  a  single  vector  G  5ft2,  Mi  G  [1, ...,  n\. 
The  purpose  behind  this  procedure  is  to  enforce  spatial  ho¬ 
mogeneity  in  the  spectral  abundance  estimates  in  the  regions 
where  the  LiDAR’ s  active  signal  reached  farthest. 

Up  to  this  point,  each  pixel  is  treated  independently  from 
each  other.  To  exploit  the  structural  scene  information  avail¬ 
able  from  LiDAR,  one  can  enforce  the  estimation  of  the  abun¬ 
dance  coefficients  A  to  be  influenced  by  the  spatial  geometry 
of  the  point  cloud  data,  hence  inducing  spatial  and  spectral 
coherence  in  the  abundance  estimation  process.  This  coher¬ 
ence  will  depend  both  on  the  pixels’  spectral  shape  and  the 
geometry  of  LiDAR  data.  Let  T  be  a  collaborative  term  on 
the  coefficients, 

^(M, Wjjaj)  =  ||(Maj  -  (3) 

ler) 

where  y  denotes  a  predefined  neighborhood  associated  to  the 
i  —  th  pixel.  T  will  highly  depend  on  the  weighting  function 
wu .  An  example  of  such  a  function  is 

l  /  lly»~yz  111  \  /  II tj-ti  III  \ 

wu  =  —(aex p  ;  +(1  —  a)  exp  ),  (4) 

where  a  G  [0, 1]  controls  the  contribution  of  each  data  source, 
Zi  is  a  pixel-dependent  normalization  constant  such  that 
J2ier)  mi  =  1,  r  is  a  vectorized  spatial  window  (patch) 
around  each  of  the  concatenated  LiDAR  range  and  intensity 
samples  r,  and  cr2,  a2  are  density  parameters  for  the  spectral 
and  range  content,  respectively,  controlling  the  width  of  the 
weighting  function  (here  set  to  be  the  average  of  the  data’s 
pairwise  Euclidean  distance,  either  local  for  each  pixel  or 
global  for  the  whole  data).  This  weighting  function  is  close 
to  1  if  the  both  the  hyperspectral  pixels  and  the  LiDAR  local 
coordinates  are  homogeneous,  and  0  otherwise. 

Finally,  our  proposed  mapping  approach  remounts  to 
solving  the  optimization  problem 

n 

A*  =  argmin  E(^(Y>  a»)  +  ^(M,  w*;  a*)},  (5) 

i= 1 

where  (3  >  0  is  a  parameter  controlling  the  amount  of  collab¬ 
oration  between  LiDAR  and  HSI  data  samples.  Notice  that 
^(M,  w$;  aj  introduces  a  variable  coupling.  We  efficiently 
solve  this  using  the  Split  Bregman  method  [11]  and  primal 

Tn  this  work,  we  exploit  the  LiDAR’s  point  cloud  data.  Rasterized  data 
could  also  be  used  as  an  alternative  data  source. 


decomposition.  First,  we  reformulate  Equation  (5)  as  to  solv¬ 
ing  the  following  constrained  optimization  problem  (for  each 
pixel): 

n 

minait0^{'H(yi,^;ai)  +  Hv^!  +  ||uj||i} 

s.t.  Vi  =  ai;Ui  =  Maj  -  e  [l,n].  (6) 

lerj 

Second,  the  constraints  are  enforced  by  applying  an  Aug¬ 
mented  Lagrangian  formulation: 

£(a,v,u,b,c)  =  U( y,  ^;a)  +  ||v||j.  +  ||u||i 

+A  <  b,  a  —  v  >  +/3  <  c,  g  -  u  >  +  ^||a-  v||| 

+  ||]g-u|li>  (7) 

where  g  =  w/Maj,  and  we  maximize  for  the  dual  vari¬ 
ables  b  and  c,  and  minimize  for  a,  v,  and  u.  Finally,  the 
proposed  abundance  mapping  algorithm  is  reduced  to  solving 
the  following  subproblems  independently: 

at+1  =  argmin{i||^y-at||i  +  ^||at-vt  +  bt||i 

aXO  Z  Z 

+  ^l|Ma*  —  g*  +  c*  1 1 2  } ,  (8) 

vt+1  =  argmin{||v‘||1  +  ^||a‘+1-vt  +  b‘||l},  (9) 

vXO  Z 

ut+1  =  argmin{||u‘||i  +  ^||Ma*+1-g*+1+c*|||},  (10) 

bt+1  =  bt  —  vt+1  +  at+1,  (11) 

ct+1  =  ct  —  gt+1  +  Mat+1  —  ut+1.  (12) 

These  subproblems  are  solved  until  convergence  in  the  £2- 
norm  of  a,  which  takes  about  50  iterations  in  our  experiments. 
Note  that  the  subproblems  can  be  solved  simply  via  inversion 
(Equation  (8)),  shrinking  (equations  (9)  and  (10)),  and  explic¬ 
itly  (equations  (11)  and  (12)),  see  [1 1  ]  for  more  details.  2  This 
concludes  the  subpixel  modeling  procedure.  Full-pixel  label¬ 
ing  derives  directly  by  selecting  the  i  —  th  pixel’s  label  corre¬ 
sponding  to  the  maximum  element  of  Ma^ .  We  now  proceed 
with  experiments  supporting  our  model. 

4.  EXPERIMENTS 

In  this  section,  we  validate  our  model  by  applying  it  to  co¬ 
registered  HSI  and  LiDAR  data.  This  dataset  consists  of 
an  airborne  data  collection  over  Gulfport,  Mississippi,  in 
November,  2010.  The  scene  is  composed  of  low  density 
urban  and  coastal  regions.  The  HSI  data  was  acquired  with 
a  CASI- 15 00  sensor,  with  a  spectral  range  of  375-1050  nm 

2 The  nonnegativity  constraint  is  enforced  by  projecting  into  nonnegative 
numbers. 


in  72  bands.  The  LiDAR  data  was  acquired  with  an  Optech 
ALTM  Gemini  sensor,  operating  at  a  wavelength  of  1064 
nm.  These  data  are  co-registered  at  1  m  spatial  resolution, 
with  a  total  of  324  x  500  pixels.  We  analyze  two  scenarios: 
supervised  and  unsupervised  mapping.  On  both  scenarios, 
we  compare  the  mapping  results  with  and  without  LiDAR 
information,  hence  highlighting  the  benefits  of  the  proposed 
joint  modeling  scheme.  We  selected  A  =  ^=,  and  a  =  0.9. 

The  neighborhood  77  for  each  pixel  patch  of  3  x  3, 3  was 
composed  using  4  spatially  connected  overlapping  patches 
and  the  4  most  similar  patches  across  the  entire  image  spatial 
domain.  All  experimental  results  are  illustrated  in  Figure  1. 
Subfigures  1(a)  and  1(b)  show  false  color  composites  from 
the  scene  for  the  LiDAR  and  HSI  data,  respectively. 

On  a  supervised  setting,  we  used  a  priori  averaged  spectra 
from  11  materials.  These  materials  are  labeled:  Cl:  canvas , 
C2:  fabric  #7,  C3:  fabric  #2,  C4:  trees,  C5:  healthy  grass, 
C6:  grounds,  C7:  asphalt,  C8:  red  roof,  C9:  brown  roof, 
CIO:  tan  roof,  and  Cll:  sand.  These  spectra  served  as  the 
dictionary  ^37.  We  processed  the  data  by  using  the  proposed 
mapping  algorithm  for  f3  =  0  and  /?  =  ^=,  that  is,  with  and 
without  fusion.  Subfigures  1(e)  and  1(f)  show  these  full-pixel 
mappings.  Notice  how  the  estimates  are  smoother  in  Subfig¬ 
ure  1(f),  for  instance,  a  more  homogeneous  region  around  the 
red  building  on  the  lower  left  of  the  image.  Also,  there  are 
grounds  pixels  that  are  incorrectly  labeled  as  concrete ,  and 
are  correctly  labeled  by  activating  the  fusion  term  in  the  pro¬ 
posed  model. 

On  an  unsupervised  setting,  we  followed  the  endmember 
learning  procedure  from  [1  ].  Basically,  \F  is  initialized  using 
a  single  estimated  spectra  for  each  of  the  C  materials  using  a 
nonnegative  matrix  factorization,  where  the  abundance  coef¬ 
ficients  are  constrained  to  sum  to  one,  and  continues  adding 
atoms  to  each  subdictionary  until  the  change  in  recon¬ 
struction  error  reaches  1  x  10-4.  We  applied  the  proposed 
fusion  algorithm  after  learning  T'  with  C  m  11  materials 
(labeled  as  U’s  in  Figure  1).  In  Subfigure  1(d),  we  show  a 
spectral  sample  reconstructed  using  the  proposed  model  cor¬ 
responding  to  a  small  tree  under  the  shade  from  a  taller  build¬ 
ing.  This  sample  is  compared  with  the  original  HSI  sample, 
and  the  average  (supervised)  spectra  from  the  trees  class.  The 
fused  spectra  shows  a  higher  amplitude  in  the  channels  cor¬ 
responding  to  green  and  red  wavelength.  This  is  due  to  the 
collaboration  effect  of  the  proposed  model,  enforcing  homo¬ 
geneous  regions  from  LiDAR  and  HSI  to  have  similar  abun¬ 
dance  values.  Subfigures  1(g)  and  1(h)  illustrate  the  full- 
pixel  mappings  with  and  without  fusion.  Again,  we  observe 
a  smoother  mapping  in  Subfigure  1(h).  Finally,  in  Subfigure 
1(c),  we  illustrate  the  false  color  composite  after  applying  the 
proposed  model  (in  an  unsupervised  manner).  Notice  how  the 
effect  of  shading  caused  by  the  sun  in  the  HSI  scene  is  signif- 


3  Spatial  patches  were  used  for  the  LiDAR  depth  and  intensity  data.  Single 
pixels  were  used  for  HSI  data. 


Fig.  1.  Fusion  of  HSI  and  LiDAR  data  from  the  Gulfport  scene,  (a)  Depth-intensity- average  map  from  LiDAR.  (b)  False  color  RGB  from 
hyperspectral  scene,  (c)  False  color  RGB  from  HSI-LiDAR  fused  scene,  (d)  Influence  of  LiDAR  data  into  spectral  estimation,  pixel  (162,160). 
(e)  Supervised  spectral  mapping,  no  fusion,  (f)  Supervised  spectral  spectral  mapping  with  fusion,  (g)  Unsupervised  spectral  mapping,  no 
fusion,  (h)  Unsupervised  spectral  mapping  with  fusion.  (This  is  a  color  figure.) 


icantly  alleviated  in  the  new  representation. 

5.  CONCLUSION 

We  presented  a  sparse  modeling  algorithm  for  source  sepa¬ 
ration  and  classification  using  hyperspectral  imagery  and  Li¬ 
DAR.  The  range  information  from  LiDAR  data  provides  rich 
structural  information,  and  is  used  to  enhance  the  classifica¬ 
tion  performance  in  HSI.  An  affinity  function  that  combines 
the  spectral  information  along  with  the  spatial  information  in 
LiDAR  is  incorporated  in  the  model  to  promote  collaboration 
between  the  two  data  sources.  The  proposed  unsupervised  al¬ 
gorithm  learns  a  structured  dictionary  representing  the  spec¬ 
tral  sources/endmembers,  and  expresses  each  pixel  as  a  sparse 
linear  combination  of  the  dictionary  atoms.  These  coefficients 
provide  information  for  spectral  abundance  mapping  and  clas¬ 
sification.  We  performed  experiments  using  real  HSI/LiDAR 
data  illustrating  the  advantages  of  multimodal  information  for 
remote  sensing  applications.  In  particular,  we  showed  how 
using  this  model  alleviates  the  effects  of  partial  occlusions 
caused  by  elevation  differences  and  shading. 
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